Data deprivation, the lack of easily available and actionable information on the well-being of individuals, is a significant challenge for the developing world and an impediment to the design and operationalization of policies intended to alleviate poverty. In this paper, we explore the suitability of data derived from OpenStreetMap (OSM) to proxy for the location of two crucial public services: schools and health clinics. Thanks to the efforts of thousands of digital humanitarians, online mapping repositories such as OpenStreetMap contain millions of records on buildings and other structures, delineating both their location and often their use. Unfortunately, much of this data is locked in complex, unstructured text, rendering it seemingly unsuitable for classifying schools or clinics. We apply a scalable, unsupervised learning method, topic modeling, to unlabeled OpenStreetMap building data to extract the location of schools and health clinics in ten countries in Africa. We find that the topic modeling approach greatly improves performance over reliance on structured keys alone. We validate our results by comparing the schools and clinics identified by our OSM method with those identified by the WHO, and we describe OSM coverage gaps more broadly.
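Cadences are complex structures that have been driving music from the beginnings of contrapuntal polyphony until today. Detecting such structures is vital for many MIR tasks, such as music analysis, key detection, or music segmentation. However, automatic cadence detection remains challenging, mainly because it involves a combination of high-level musical elements such as harmony, voice leading, and rhythm. In this work, we present a graph representation of symbolic scores as an intermediate means to solve the cadence detection task. We approach cadence detection as an imbalanced node classification problem using a graph convolutional network. We obtain results that are roughly on par with the state of the art, and we present a model capable of making predictions at multiple levels of granularity, from individual notes to beats, thanks to the fine-grained, note-level representation. Moreover, our experiments suggest that graph convolution can learn non-local features that assist in cadence detection, freeing us from having to devise specialized features that encode non-local context. We argue that this general approach to modeling musical scores and classification tasks has a number of potential advantages beyond the specific recognition task presented here.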
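To illustrate how graph convolution can pick up non-local context, the following is a minimal sketch of the standard graph convolution propagation rule, ReLU(D^{-1/2}(A + I)D^{-1/2} H W), in plain Python. It is not the authors' model: the four-node path graph standing in for consecutive notes, the single input feature, and the fixed identity weight are all assumptions made for illustration. The point is only that stacking layers lets a node's representation depend on nodes more than one edge away.

```python
import math

def normalized_adjacency(edges, n):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected graph with n nodes."""
    # Start from the identity so every node has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Hypothetical toy score graph: four notes connected in a path 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]
a_hat = normalized_adjacency(edges, 4)

# Only note 0 carries a nonzero input feature.
h0 = [[1.0], [0.0], [0.0], [0.0]]
w = [[1.0]]  # fixed identity weight for illustration; learned in practice

h1 = gcn_layer(a_hat, h0, w)  # information reaches direct neighbours of note 0
h2 = gcn_layer(a_hat, h1, w)  # information reaches notes two edges away
```

After one layer, note 2 still has a zero feature; after two layers it does not, while note 3 (three edges from note 0) remains at zero. Each additional layer widens the receptive field by one edge.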
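Time series partitioning is an essential step in most machine-learning-driven, sensor-based IoT applications. This paper introduces a sample-efficient, robust time series segmentation model and algorithm. We show that by learning a representation specifically for a segmentation objective based on maximum mean discrepancy (MMD), our algorithm can robustly detect time series events across different applications. Our loss function allows us to infer whether consecutive sequences of samples are drawn from the same distribution (the null hypothesis) and to determine the change point between pairs that reject the null hypothesis (i.e., that come from different distributions). We demonstrate its applicability in a real-world IoT deployment for ambient-sensing-based activity recognition. Moreover, while many works on change-point detection exist in the literature, our model is significantly simpler and matches or outperforms state-of-the-art methods. We can fully train our model in 9-93 seconds on average, with little variation across data from different applications.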
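The idea of testing whether consecutive sample windows come from the same distribution can be sketched with a plain kernel MMD estimate. The code below is a simplified illustration, not the paper's method: it computes MMD on raw scalar values with a fixed Gaussian kernel rather than on a learned representation, and the function names, window size, bandwidth, and toy signal are all assumptions.

```python
import math

def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalar samples."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two scalar samples."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

def best_change_point(series, window=5, bandwidth=1.0):
    """Return the split index whose adjacent windows differ most under MMD."""
    best_index, best_score = None, float("-inf")
    for t in range(window, len(series) - window + 1):
        left = series[t - window:t]
        right = series[t:t + window]
        score = mmd_squared(left, right, bandwidth)
        if score > best_score:
            best_index, best_score = t, score
    return best_index, best_score

# Hypothetical toy signal: the level shifts from about 0 to about 5 at index 10.
signal = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.0, -0.3, 0.1, 0.0,
          5.1, 4.8, 5.0, 5.2, 4.9, 5.1, 5.0, 4.7, 5.3, 5.0]
index, score = best_change_point(signal)
```

A score near zero means the two windows look like draws from the same distribution (the null hypothesis is not rejected), and a large score marks a candidate change point. In practice the rejection threshold would be set by a permutation test or a similar calibration, which this sketch omits.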